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Using the Dominant Reaction Pathways (DRP) method, we perform an ab-initio quantum- 
mechanical simulation of a conformational transition of a peptide chain. The method we propose 
makes it possible to investigate the out-of-equilibrium dynamics of these systems, without resorting 
to an empirical representation of the molecular force field. It also allows to study rare transitions 
involving rearrangements in the electronic structure. By comparing the results of the ab-initio sim- 
ulation with those obtained employing a standard force field, we discuss its capability to describe 
the non-equilibrium dynamics of conformational transitions. 



PQ INTRODUCTION 

o 

The theoretical investigation of conformational transitions of polypeptide chains is usually performed using tech- 
niques like molecular dynamics (MD), Monte Carlo, transition path sampling based on classical Molecular Mechanics 
^^ (MM) approaches |1J. MM methods are computationally very efficient, thus making simulations of molecules with thou- 
sands of atoms feasible on modern computer clusters [2j [3]. In many cases the outcome of the simulations compares 
-H favorably with experimental results [4]. 

^ MM methods are based on an empirical representation of the potential energy function of molecules (the so-called 

force field) which relies on a chemical model of the bonding fitted on quantum calculations and experimental results. 
ivj A force field is defined by the functional form of the different components that make it up, and by the values of a 

i/^ set of parameters appearing in the components. The parameters are usually determined based on the equilibrium 
1^ configuration of molecules. This approach is acceptable when the focus is on small thermal oscillations, but may 
^^ become inadequate when the system undergoes out-of-equilibrium transitions. 

^^ Another limitation of the MM approach arises when the transition involves a rearrangement of the electronic 

^"H structure, as is the case for the cleavage or formation of chemical bonds, like for instance a sulphur bridge. 

^ All these problems could in principle be solved by adopting a quantum mechanical approach to the dynamics of 

the molecule. Given the formidable complexity of a full quantum description, two approximations are usually invoked 

in ab-initio simulations: the Born-Oppenheimer separation of the dynamics of nuclei and electronsPl, and a classical 

^ treatment of nuclear degrees of freedom. 

However, for molecular systems the size of poly-peptide chains, the quantum calculation of the molecular energy of 
a conformation is computationally quite expensive. As a result, the ab-initio quantum- mechanical approach is usually 
adopted to infer static properties or to study the dynamics over very short time intervals, typically up to hundreds 
of picoseconds ^6 . On the other hand, the time scales involved in the conformational transitions of poly-peptide 
chains range from nanoseconds — for the rotations around dihedral angles — to milliseconds or even seconds — for the 
formation of tertiary structures in proteins — . 

In this work, we show that the DRP formalism [THE] provides a rigorous and computationally efficient framework 
which makes it feasible to investigate ab-initio the folding dynamics of peptide chains. 
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The DRP approach is a method which yields the statistically most significant reaction pathways, in systems de- 
scribed by the over-damped Langevin equation. Its computational advantage resides in the fact that it does not waste 
CPU time in simulating thermal oscillations in (meta)-stable states visited during the reaction and that the sampling 
of the transition pathways is performed at constant spatial displacement steps, rather than at constant time steps. 
The reliability of the DRP approach when applied to investigating thermally activated conformational reactions of 
peptide chains within an MM framework has been tested in a series of works based on both atomistic [10] and coarse- 
grained models [TT1[T2]. In particular, in [12 the folding trajectories obtained by means of Molecular Dynamics (MD) 
simulations were compared directly with those calculated in the DRP approach, and the two methods were found to 
give consistent results. 

In [13] the DRP formalism was applied to perform an ab-initio calculation of the dominant pathways in the 
cyclobutene -^ butadiene transition, a thermally activated reaction which involves the breaking and formation of 
covalent bonds. In this approach, the electronic structure and the molecular energy of the chain was determined at 
each step of the calculation, by approximatively solving the Schrodinger equation. 

In the present work, we use the same approach to compute ab-initio the dominant reaction pathway for a confor- 
mational transition of tetra-alanine, leading to the formation of the elementary unit of a helix — see Fig. [2] — . 

Our first goal is to show that by using the DRP formalism the ab-initio simulation of the reaction can be performed 
at a relatively low computational cost. Indeed, the calculation of the most probable pathway connecting a single 
initial and final configuration required about 23,000 CPU hours. 

A second goal of the present work is to address the question whether force fields which are fitted on ab-initio 
calculations of equilibrium properties also agree with quantum mechanical calculations in non-equilibrium conditions. 
To this goal, we compare the dominant reaction pathways obtained ab-initio with those calculated with the AMBER- 
99 force field \^. We find that the classical approximation describes with reasonable accuracy the dynamics also in 
the transition region, producing trajectories which are in semi- quantitative agreement with those obtained ab-initio. 

Finally, as an example of an observable which cannot be computed from classical MM simulations, we analyze the 
evolution of the partial charges of the atoms which are involved in the hydrogen bonds in the helix configuration. 

MODEL 
The ab-initio dominant reaction pathAvays approach 

We consider a generic molecule consisting of N atoms with nuclear coordinates X = (xi, . . . ,XAr), in contact 
with a thermal-bath at temperature T. The atomic nuclei are assumed to be classical point-like particles, evolving 
according to Langevin dynamics. The electrons are coupled quantum mechanically to the nuclear positions, in the 
Born-Oppenheimer approximation, i.e. their wave-function is assumed to instantaneously relax to the ground state, 
for each nuclear configuration X. 

On time-scales larger than a ps the dynamics of atoms in proteins is well described by the over-damped limit of the 
Langevin equation 

X = -^Vf/(X)+r?(t). (1) 

In this Eq., ks is the Boltzmann constant, T is the temperature and D is the diffusion coefficient, which we shall 
assume to be the same for all atoms (the generalization to the case in which each atom has a different diffusion 
coefficient is straightforward). /7(X) is the molecular energy for system with the nuclei in configuration X. In MM 
simulations this quantity is given by the force field. In ab-initio simulations, /7(X) is the sum of the ground-state 
energy of the electron wave-function and of the electrostatic energy of the classical nuclei. 

In Eq. ([l]), 7^(t) is a random noise with Gaussian distribution, zero average and correlation given by 

{r],{t)r]j{t'))=2D6,jS{t-t'), 

where i, j label all the nuclear degrees of freedom. 

Let us now consider the probability for the molecule to be found in the configuration Xj at time tj, provided it 
was prepared in some initial configuration X^ at time ti. Such a conditional probability can be represented in the 
following path integral form — for the details of the derivation see e.g. [8j — : 

P{Xf,tf\Xi,ti)=e 5^^^/ PX(r) e-^'" 1^(^)1, (2) 



where 5'e//[X(r)] is called the effective action functional calculated on the trajectory X(r) and is given by 



*^ , / 1 



5e//[X(r)] = / dr [-— X^t) + T4//[X(r)] . (3) 
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VeffCX.) is called the effective potential, and reads 



K//(X) = ^|£^ [|VC/(X)|' - 2 kBT V2[/(X) ] . (4) 

Notice that, if Xj and X^ are product and reactant configurations respectively, then the factor exp(— S'e//[X]) inside 
the path integral expression ([2| represents the statistical weight of a given reactive trajectory X(t). 

The most probable (or dominant) reaction pathways are those which minimize the effective action functional 
5'e//[X]. Hence, they are the solutions of the classical equations of motion generated by the effective action, i.e. 

X = 2DVT4//(X), (5) 

with boundary conditions X(ti) = X^ and X(t/) = Xj. 

Note that the dynamics described by the equation of motion ibl conserves the effective energy E^ff = ^X^(t) — 
Veff (X(t)) . This property allows to switch from the time-dependent Newtonian description to the equivalent energy- 
dependent Hamilton-Jacobi (HJ) description. In the HJ framework, the most probable pathways connecting the given 
initial and final configurations can be shown to be those which minimize the target HJ functional 



Shj[K] = jj dl^^ [E^ff + V^ff (X(0)], (6) 



where dl = VoX^ is the elementary displacement in configuration space, along the dominant reaction path. Note 
that the dominant trajectory in Eq. ^ is parametrized in terms of the curvilinear abscissa /, which plays the role of 
the reaction coordinate and measures the total distance covered along the reaction pathway. 

The computational difficulty of investigating thermally activated transitions by ordinary MD simulations is related 
to the decoupling of the time scales characterizing the dynamics of the system. The computational advantage of 
the DRP approach with respect to MD simulations comes from the fact that, by switching to the HJ formulation, 
the time variable t has been replaced by the curvilinear abscissa /. The key point is that molecular systems are 
not characterized by a decoupling of the intrinsic length scales. As a result, typically only A^^ = 10 — 100 space 
displacement steps are sufficient to attain a realistic representation of the path. This number should be compared 
to the 10^ — 10^^ MD time steps required to simulate a single transition with mean-first-passage time in the /is — 
ms range. Ultimately, such a huge computational gain originates from the fact that the DRP does not waste time to 
simulate the dynamics of the system when it is trapped in metastable states. 

Although in the HJ formulation the time variable has been replaced by the curvilinear abscissa /, the DRP formalism 
retains information about the time evolution of the system. Indeed, the time t [X] at which the configuration X is 
visited, during the most probable reaction pathway is given by 

^ 1 



t(X)=/ dl , ^. (7) 

A. V4 D [E,ff + Veff (X(0)] 

From this equation it follows that the choice of the effective energy parameter Eeff determines the total time of the 
transition. In particular, the longest possible transition path time is obtained by choosing Eeff = — Fe//(X/) [SlfTT]. 
which is a positive number if Xj is an equilibrium configuration. We stress the fact that the total time ttot = ^(X/) 
is much shorter than the mean- first-passage time, as it corresponds to the time it takes to reach the product, once 
the system has left the reactant state. 

Once the dominant path has been determined, it is also possible to identify the configuration X^^ which belongs 
to the transition state, defined in terms of commitment analysis. This is achieved by requiring that the probability 
in the saddle-point approximation to diffuse back to the initial configuration X^, P(X^, t/|Xts, t^) equates that of 
evolving toward the final configuration X/, P(X/, t/|Xts, t^). In the saddle-point approximation, this condition leads 
to the simple equation [lOj: 

^™fcBr ^"^'^ = 5gj([X(0];X,„X,) - ggj([X(0];X,„Xy). (8) 



Details of the simulations 

We have studied the folding of tetra-alanine at a temperature T = 300i^. We discretized the path using Ns = 16 
equal displacement slices. The effective energy parameter E^fj was chosen to be slightly larger than the value 
corresponding to the longest possible transition time, i.e. 

Eeff = -I Kffi^f)- (9) 

The molecular energy in the quantum simulations, /7(X), was evaluated by solving the Schrodinger Eq. in the 
Parameterized Model 3 (PM3) scheme, in the MOPAC implementation [14 . The choice of a semi-empirical quantum 
mechanical method was made because it combines a very low computational cost with a reasonable description of 
the hydrogen bond energetics [151116]. In addition, parameterized model quantum mechanical calculations have been 
recently shown to reliably describe various types of non-covalent complexes [17 . 

Since one of the purposes of this work is to compare classical and quantum descriptions of the inter-atomic interac- 
tions, we chose to neglect solvation terms in both the MM and ab-initio simulations. However, the inclusion of such 
contributions at the implicit level in the quantum-mechanical simulations does not lead to a significant increase of 
the computational cost, as the bottleneck of the calculation is the solution of the Schrodinger Eq., at each step of the 
minimization. 

Finding the dominant reaction pathway amounts to minimizing a discretized version of the effective HJ functional: 



where the effective potential Ve//(X) is determined according to Q by numerically differentiating the molecular 
potential energy /7(X). A/^^^+i is the Euclidean distance between the slices i and z + 1, i.e A/^^^+i = a/|X^+i — X^| . 



In the discretized representation of the HJ effective action (10), the width of the distribution of the Euclidean 
distances between consecutive path slices, A/^ ^+i, should not be allowed to increase in an uncontrolled way, in order 
to prevent all frames to collapse into the reactant or product configurations. As discussed in ^3], the most convenient 
way to achieve this is to introduce a Lagrange multiplier, which holds fixed at 0.2 the ratio between the mean-square 
deviation from the average of the inter-slice distances a^ of the average square inter-slice distance (AP). 



The global minimization of the HJ effective action ( 10 ) is in general a very challenging task. The main difficulties 
arise from the ruggedness of the effective potential and the high dimensionality of the system. Indeed, most commonly 
used global optimization algorithms — such as e.g. simulated annealing — tend to get stuck in secondary minima 
of the action functional. On the other hand, the results of a DRP calculation can be considered reliable only if the 
minimization algorithm explores a significant region of the path space. In fact, in the opposite scenario the calculated 
dominant paths would be strongly biased by the choice of the initial trial path. 

In our previous work fT3 we tested the performances of several global minimization algorithms, and we found that 
the Fast Inertial Relaxation Engine (FIRE) [18] method was performing best. The minimization protocol which was 
adopted in the present work was the following: we started from a high-temperature (800 K) MD trial unfolding path, 
from the helix configuration. We then relaxed the path by means of a Nudged Elastic Band (NEB) [19 minimization, 
followed by a zero-temperature DRP minimization, and finally by a finite temperature DRP minimization. 



RESULTS 

A sequence of configurations which are visited by the dominant reaction pathway determined from ab-initio calcu- 
lations is shown in Fig. [2J The dashed circle highlights the configuration along the path which is representative of 
the transition state, according to Eq. (Isl). 

Polyalanine chains form a-helices, stabilized by a sequence of i — i -\- 4 hydrogen bonds. On the other hand, in 
the tetra-alanine molecule there are significant termination effects, and the minimum energy configuration is slightly 
distorted from that of an ideal a-helix. In particular, the hydrogen bonds stabilizing the tetra-alanine system occur 
between the — 6 and the H — 28 atoms and the — 16 and the H — 38 atoms. 

In order to perform a quantitative analysis of the transition, let us study the evolution along the dominant path of 
the following order parameters — see Fig. [l] — : 



• the distance (^6-28 between the O — 6 and the H — 28 atoms 

• the distance (iie-ss between the — 16 and the H — 38 atoms 

• the dihedral angle (j)i between the atoms C — 5,A/' — 7, C — 9,C — 15 

• the dihedral angle (j)2 between the atoms C — 25, A/" — 27, C — 29, C — 35. 

In Fig. [3] we compare the evolution of these quantities, as a function of the reaction coordinate / obtained from 
classical and ab-initio simulations. We also plot the initial high-temperature MD path and the path obtained at the 
end of the preliminary relaxation based on the NEB algorithm — cfr. the discussion in the section "Model" — . In Fig. 
[4] we show the same dominant paths, projected onto the planes selected by the distances (^6-28 and (iie-ss involved 
in the formation of hydrogen bonds. 

Some comments on these results are in order. First of all, these plots clearly show that the FIRE algorithm 
allows to sizeably move away from the initial trial path, irrespective of the order parameter used to characterize the 
transition. Secondly, we observe that the dominant paths obtained from MM and ab-initio simulations agree at an 
almost quantitative level. In particular, both calculations predict that the contact between the C — 6 and the H — 28 
atoms is formed before the contact between the — 16 and the H — 38 atoms. However, in the MM calculation, the 
formation of the second contact occurs at a slightly later stage of the reaction than in the quantum calculation. We 
emphasize that the observed good agreement between classical and quantum calculations is not due to the insufficient 
exploration of the path space. In fact, such dominant paths are qualitatively different from those obtained in the last 
stage of the preliminary NEB minimization. Thus, these results clearly imply that the AMBER-99 classical force 
field provides a rather accurate description of the dynamics, even in non-equilibrium conditions. 

The quantum calculation provides additional physical information, not accessible by means of MM simulations. For 
example, in Fig. [5]we plot the evolution of the partial charges of O — 6 and H — 28 and of O — 16 and H — 38 during the 
reaction. We recall that these pairs of atoms form hydrogen bonds in the final helix state. By cross-correlating this 
information with the evolution of the interatomic distance — cfr. Fig. [5] — we can infer that a significant modification 
of the electronic structure of the — 6 and H — 28 atoms sets in when they are separated by a distance of the order 
3 A. Interestingly, the electronic structure of the — 16 and H — 38 begin to change when these two atoms are 
separated by a much larger distance, of the order of 4 A. These results exhibit an example of the fact that the validity 
of some of the assumptions involved in MM calculations — such as the invariance of partial charge — may depend on 
the detail of the chemical environment in which the bond is formed and on the specific non-equilibrium dynamics of 
the reaction. 

Using Eq. ([7| it is in principle possible to obtain information about the dynamics, i.e. to compute the time at 
which each of the configurations of the dominant path is visited during the course of the transition. In particular, 
assuming a diffusion constant D = 2 ■ 10~^ A^ ps~^ for all atoms, as in [10], the total transition path times in the 
classical and quantum calculations are found to be t classical = 1.4 ps and t ab-initio = 2.3 ps, respectively. We note, 
however, that these numbers should be taken with care, because in the present exploratory calculation we used only 
Ns = 16 path discretization steps, and Eq. (JTl) is known to be quite sensitive to discretization errors. 

DISCUSSION 

In this work, we have used the DRP method to perform the first simulation of the folding reaction of a peptide 
chain, based on a ab-initio quantum-mechanical approach. We have calculated the most statistically probable reaction 
pathway connecting an initial coil configuration and a final helix configuration, assuming the over-damped Langevin 
dynamics. 

By comparing the results of the ab-initio simulation in the PM3 scheme with MM simulations with the AMBER-99 
force field, we argue that MM approaches provide a quite reliable description even for out-of-equilibrium dynamics. 
We also studied the evolution of the partial charges involved in the formation of two hydrogen bonds stabilizing the 
helix, and found that the dynamics of these observable depends on the chemical environment. 

The similarity of the quantum and classical paths suggests a "perturbative" approach to performing ab-initio DRP 
simulations at a much smaller computational cost. Indeed, our results show that the minimum of the classical HJ 
functional lies in the vicinity of the minimum of the ab-initio HJ functional. A first approximate dominant pathway 
can be calculated using the classical DRP approach, and used as a starting point for a further local minimization, in 
the ab-initio framework. We expect that a few quantum mechanical minimization steps should be enough to reach 
convergence. 



We note that in order to characterize the folding mechanism, it should be taken into account that the structure of 
the folding pathways may significantly depend on the specific initial condition from which the transition is initiated. 
Hence, an exhaustive study of the folding dynamics of this system requires a statistical analysis of an ensemble of 
folding trajectories, corresponding to different initial conditions, as discussed in [11 ^. 

We conclude this paper by discussing the computational cost of performing a similar calculation for a larger system, 
for instance a 15-residue /3-hairpin in implicit solvent. The scaling of MOPAC energy calculation with the number of 
atoms can be made linear for large molecules by introducing a cut-off for matrix elements of the Hamiltonian between 
orbitals on atoms beyond a certain distance. Using a discretization of the path in Ns=100 slices, we estimate that a 
ab-initio DRP calculation could be carried out using approximately 500,000 total CPU hours per trajectory. 

While this is a substantial amount of computing time, we point out that it can be achieved on existing large 
computing facilities. 

S. a Beccara, P. Faccioli, F. Pederiva and M. Sega are members of the Interdisciplinary Laboratory for Computa- 
tional Sciences (LISC), a joint venture of the University of Trento and Fondazione Bruno Kessler. F. P. thanks the 
Quantum Simulations Group at Lawrence Livermore National Laboratory for providing access to computing facilities. 
Part of the calculations was performed on the WIGLAF cluster at the University of Trento. 
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FIG. 1: The definition of tfie order parameters ^6-28, diess, 0i and 02 used to analyze tfie dominant reaction patfiways. 









X, 



FIG. 2: Configurations on the dominant reaction pathway calculated ab-initio. The colors on the surface represent the projection 
of the molecular electro- static potential on the solvent accessible surface. 
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FIG. 3: Upper-left panel: Evolution of the order parameter ^6-28 as a function of the reaction coordinate / for four different 
paths. Upper-right panel: Evolution of the order parameter die -38 as a function of the reaction coordinate / for four different 
paths. Lower-left panel: Evolution of the order parameter 01 as a function of the reaction coordinate I for four different paths. 
Lower-right panel: Evolution of the order parameter 02 as a function of the reaction coordinate / for four different paths. 
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FIG. 4: Comparison of the dominant reaction pathways obtained from classical and ab-initio DRP simulations, projected onto 
the plane selected by the ^6-28 and diess order parameters 
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FIG. 5: Evolution of the partial charges of the O — 6 and H — 28 atoms (upper panel) and of the O — 16 and H — 38 atoms 
(lower panel), along the dominant reaction pathway. In both figures, the partial charge of the H atoms has been shifted by 
—0.42 atomic units for seek of graphical clarity. 



